A Naïve Evaluation of Queries over Incomplete Databases
Abstract
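Framework for powerset semantics. We now cast the powerset semantics in our general relation-based framework, which enables us to establish when naïve evaluation works for it. For a set $\mathcal{D}$ of database objects and a set $\mathcal{C}$ of complete objects, we have a pair $\mathcal{R} = (\mathcal{R}_{\mathrm{val}}, \mathcal{R}_{\mathrm{sem}})$ of relations with $\mathcal{R}_{\mathrm{val}} \subseteq \mathcal{D} \times 2^{\mathcal{C}}$ and $\mathcal{R}_{\mathrm{sem}} \subseteq 2^{\mathcal{C}} \times \mathcal{C}$. The first relation corresponds to applying multiple valuations (e.g., relating $D$ with sets $\{h_1(D), \ldots, h_n(D)\}$). The second relation, in our example, is $\mathcal{R}_{\cup} = \{(\mathcal{X}, X) \mid X = \bigcup \mathcal{X}\}$. The semantics given by $\mathcal{R}$ is again the composition of two relations: $D' \in [\![D]\!]_{\mathcal{R}}$ iff $D\,(\mathcal{R}_{\mathrm{val}} \circ \mathcal{R}_{\mathrm{sem}})\,D'$.
The basic conditions on these relations are essentially the same as we used before for non-powerset semantics, except that we need to deal with relations between $\mathcal{C}$ and $2^{\mathcal{C}}$. Let $\mathrm{id}_l \subseteq \mathcal{C} \times 2^{\mathcal{C}}$ contain precisely all pairs $(c, \{c\})$ and $\mathrm{id}_r \subseteq 2^{\mathcal{C}} \times \mathcal{C}$ contain precisely all pairs $(\{c\}, c)$ for $c \in \mathcal{C}$. We say that a semantics $[\![\,\cdot\,]\!]_{\mathcal{R}}$ is given by $\mathcal{R}$ if both relations are total, relation $\mathcal{R}_{\mathrm{val}}$ equals $\mathrm{id}_l$ when restricted to $\mathcal{C}$, relation $\mathcal{R}_{\mathrm{sem}}$ contains $\mathrm{id}_r$, and $D' \in [\![D]\!]_{\mathcal{R}}$ iff $D\,(\mathcal{R}_{\mathrm{val}} \circ \mathcal{R}_{\mathrm{sem}})\,D'$. Previously we just used identity instead of $\mathrm{id}_l$ and $\mathrm{id}_r$. We say that $\mathcal{R}_{\mathrm{sem}}$ is transitive if $\mathcal{R}_{\mathrm{sem}} \circ \mathrm{id}_l \circ \mathcal{R}_{\mathrm{sem}} \subseteq \mathcal{R}_{\mathrm{sem}}$. Note that $\mathcal{R}_{\cup}$ is transitive. Now we have an analog of Proposition 4.2.
PROPOSITION 10.2. A pair $\mathcal{R} = (\mathcal{R}_{\mathrm{val}}, \mathcal{R}_{\mathrm{sem}})$ gives rise to a fair database domain if $\mathcal{R}_{\mathrm{sem}}$ is transitive.
PROOF. We prove a more general necessary and sufficient condition for fairness:
LEMMA 10.3. A powerset semantics given by $\mathcal{R} = (\mathcal{R}_{\mathrm{val}}, \mathcal{R}_{\mathrm{sem}})$ gives rise to a fair database domain iff $\mathcal{R}_{\mathrm{val}} \circ \mathcal{R}_{\mathrm{sem}} \circ \mathrm{id}_l \circ \mathcal{R}_{\mathrm{sem}} \subseteq \mathcal{R}_{\mathrm{val}} \circ \mathcal{R}_{\mathrm{sem}}$. In particular, if $\mathcal{R}_{\mathrm{sem}}$ is transitive then the database domain is fair.
PROOF. Assume first that $\mathcal{R}_{\mathrm{val}} \circ \mathcal{R}_{\mathrm{sem}} \circ \mathrm{id}_l \circ \mathcal{R}_{\mathrm{sem}} \subseteq \mathcal{R}_{\mathrm{val}} \circ \mathcal{R}_{\mathrm{sem}}$, and take an arbitrary $x \in \mathcal{D}$ and $c \in \mathcal{C}$. We have:
(1) $c \in [\![c]\!]_{\mathcal{R}}$. Indeed, we know $(c, \{c\}) \in \mathcal{R}_{\mathrm{val}}$ and $(\{c\}, c) \in \mathcal{R}_{\mathrm{sem}}$, so $c \in [\![c]\!]_{\mathcal{R}}$.
(2) $c \in [\![x]\!]_{\mathcal{R}}$ implies $[\![c]\!]_{\mathcal{R}} \subseteq [\![x]\!]_{\mathcal{R}}$. Indeed, if $c \in [\![x]\!]_{\mathcal{R}}$ there exists $y \subseteq \mathcal{C}$ such that $(x, y) \in \mathcal{R}_{\mathrm{val}}$ and $(y, c) \in \mathcal{R}_{\mathrm{sem}}$. Moreover, if $c' \in [\![c]\!]_{\mathcal{R}}$ then $(c, c') \in \mathrm{id}_l \circ \mathcal{R}_{\mathrm{sem}}$ (because $\mathcal{R}_{\mathrm{val}}$ is $\mathrm{id}_l$ when restricted to $\mathcal{C}$). Hence $(x, c') \in \mathcal{R}_{\mathrm{val}} \circ \mathcal{R}_{\mathrm{sem}} \circ \mathrm{id}_l \circ \mathcal{R}_{\mathrm{sem}}$. This implies $(x, c') \in \mathcal{R}_{\mathrm{val}} \circ \mathcal{R}_{\mathrm{sem}}$, and therefore $c' \in [\![x]\!]_{\mathcal{R}}$.
ACM Transactions on Database Systems, Vol. V, No. N, Article A, Publication date: January YYYY.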
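The definitions above can be checked mechanically on a small finite instance. The following Python sketch is illustrative only and is not taken from the paper: it assumes the complete objects are the four subsets of a two-element universe, adds a single hypothetical incomplete object `NULL_DB` whose valuation images are assumed to be `{1}` and `{2}`, and reads composition left to right, as in the definition of the semantics. All function and variable names are made up for the example.

```python
from itertools import chain, combinations


def powerset(items):
    """All subsets of `items`, each returned as a frozenset."""
    items = list(items)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def compose(r, s):
    """Left-to-right composition: (a, c) whenever (a, b) in r and (b, c) in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


# Assumption: complete objects C are all subsets of the universe {1, 2}.
C = powerset({1, 2})
POW_C = powerset(C)  # 2^C: every set of complete objects

# Assumption: one incomplete object whose valuations yield {1} or {2}.
NULL_DB = "NULL_DB"
IMAGES = [frozenset({1}), frozenset({2})]

id_l = {(c, frozenset({c})) for c in C}
id_r = {(frozenset({c}), c) for c in C}

# R_val is id_l on complete objects and relates NULL_DB to every
# nonempty set of valuation images.
r_val = set(id_l) | {(NULL_DB, ys) for ys in powerset(IMAGES) if ys}

# R_union = {(X_set, X) | X is the union of the members of X_set}.
r_union = {(xs, frozenset().union(*xs)) for xs in POW_C}


def is_transitive(r_sem):
    """R_sem o id_l o R_sem is contained in R_sem."""
    return compose(compose(r_sem, id_l), r_sem) <= r_sem


def lemma_condition(r_val_, r_sem):
    """R_val o R_sem o id_l o R_sem is contained in R_val o R_sem."""
    base = compose(r_val_, r_sem)
    return compose(compose(base, id_l), r_sem) <= base


def semantics(x, r_val_, r_sem):
    """[[x]]_R: all complete objects related to x by R_val o R_sem."""
    return {c for (a, c) in compose(r_val_, r_sem) if a == x}


def is_fair(objects, r_val_, r_sem):
    """Conditions (1) and (2) from the proof, checked by enumeration."""
    reflexive = all(c in semantics(c, r_val_, r_sem) for c in C)
    closed = all(
        semantics(c, r_val_, r_sem) <= semantics(x, r_val_, r_sem)
        for x in objects
        for c in semantics(x, r_val_, r_sem)
    )
    return reflexive and closed


if __name__ == "__main__":
    print("R_union contains id_r:", id_r <= r_union)
    print("R_union transitive:", is_transitive(r_union))
    print("Lemma 10.3 condition:", lemma_condition(r_val, r_union))
    print("Fair:", is_fair(list(C) + [NULL_DB], r_val, r_union))
```

On this toy instance the semantics of `NULL_DB` consists of `{1}`, `{2}`, and `{1, 2}`, since the union relation may merge any nonempty set of valuation images. Enumeration like this only illustrates the definitions on one finite example; it does not replace the proof.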
Similar resources
Approximation Algorithms for Computing Certain Answers over Incomplete Databases
Certain answers are a widely accepted semantics of query answering over incomplete databases. Since their computation is a coNP-hard problem, recent research has focused on developing evaluation algorithms with correctness guarantees, that is, techniques computing a sound but possibly incomplete set of certain answers. In this paper, we show how novel evaluation algorithms with correctness guar...
Computing Approximate Certain Answers over Incomplete Databases
Certain answers are a widely accepted semantics of query answering over incomplete databases. Since their computation is a coNP-hard problem, recent research has focused on developing evaluation algorithms with correctness guarantees, that is, techniques computing a sound but possibly incomplete set of certain answers. In this paper, we show how novel evaluation algorithms with correctness guar...
When is Naïve Evaluation Possible?
The term naïve evaluation refers to evaluating queries over incomplete databases as if nulls were usual data values, i.e., to using the standard database query evaluation engine. Since the semantics of query answering over incomplete databases is that of certain answers, we would like to know when naïve evaluation computes them: i.e., when certain answers can be found without inventing new spec...
Relational Databases Query Optimization using Hybrid Evolutionary Algorithm
Optimizing database queries is one of the hard research problems. Exhaustive search techniques like dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they require too much memory and processing to remain practical, so random and evolutionary methods have to be used instead. The use of evolutionary methods, beca...
Managing Probabilistic Data with MystiQ: The Can-Do, the Could-Do, and the Can't-Do
MystiQ is a system that allows users to define a probabilistic database, then to evaluate SQL queries over this database. MystiQ is a middleware: the data itself is stored in a standard relational database system, and MystiQ is providing the probabilistic semantics. The advantage of a middleware over a reimplementation from scratch is that it can leverage the infrastructure of an existing datab...
Indexing Incomplete Databases
Incomplete databases, that is, databases that are missing data, are present in many research domains. It is important to derive techniques to access these databases efficiently. We first show that known indexing techniques for multi-dimensional data search break down in terms of performance when indexed attributes contain missing data. This paper utilizes two popularly employed indexing techniq...